fix: decouple visual styling from tbl_hierarchical_rate_by_grade to prevent Cartesian join explosion#228
fix: decouple visual styling from tbl_hierarchical_rate_by_grade to prevent Cartesian join explosion#228Melkiades wants to merge 61 commits into
Conversation
|
@jszczypinski sorry for adding you, I just asked for last committer ahah |
…revent Cartesian join explosion tbl_hierarchical_rate_by_grade() was blanking the label column for grade rows before returning, causing tbl_merge() inside tbl_with_pools() to lose row uniqueness and produce a Cartesian cross-join. Split into two functions: - tbl_hierarchical_rate_by_grade(): returns structurally pristine table with unique label values, injects custom_info metadata - add_grade_column(): post-processing function that applies all visual formatting (label_grade column, label blanking, headers, indentation) Co-authored-by: Ona <no-reply@ona.com>
…revent Cartesian join explosion tbl_hierarchical_rate_by_grade() was blanking the label column for grade rows before returning, causing tbl_merge() inside tbl_with_pools() to lose row uniqueness and produce a Cartesian cross-join. Split into two functions: - tbl_hierarchical_rate_by_grade(): returns structurally pristine table with unique label values, injects custom_info metadata - add_grade_column(): post-processing function that applies all visual formatting (label_grade column, label blanking, headers, indentation)
- add_grade_column() now returns early if label_grade already exists - metadata lookup iterates over x$tbls instead of hardcoding tbls[[1]] - added idempotency test (test #8) - fixed broken comment block in tbl_with_pools test Co-authored-by: Ona <no-reply@ona.com>
- add_grade_column() now returns early if label_grade already exists - metadata lookup iterates over x$tbls instead of hardcoding tbls[[1]] - added idempotency test (test #8) - fixed broken comment block in tbl_with_pools test
Co-authored-by: Ona <no-reply@ona.com>
1afacd6 to
8becdc8
Compare
|
@jszczypinski I did revert them ahah crazy stuff |
Melkiades
left a comment
There was a problem hiding this comment.
Self-review after fixes. All previously identified issues are resolved:
- Idempotency guard added (line 75): early return if
label_gradealready exists, prevents silent corruption on double-call. - Robust metadata lookup (line 78):
Find()iterates overx$tblsinstead of hardcodingtbls[[1]]. - Idempotency test added (test #8): verifies
table_bodyis identical after second call and grade labels are not corrupted. - Comment block fixed in
test-tbl_with_pools.R. - Docs updated to match implementation.
Remaining notes:
- Output snapshots for
tbl_hierarchical_rate_by_gradewere cleared and need regeneration on firstdevtools::test()run (testthat::snapshot_accept()). devtools::document()must be run to exportadd_grade_columnin NAMESPACE.
…elect deprecation Three tidyselect contexts in tbl_mmrm() were passing string variables (arm, visit) as bare names instead of using all_of(). This triggered deprecation warnings from tidyselect 1.1.0+. Co-authored-by: Ona <no-reply@ona.com>
…elect deprecation Three tidyselect contexts in tbl_mmrm() were passing string variables (arm, visit) as bare names instead of using all_of(). This triggered deprecation warnings from tidyselect 1.1.0+.
51ffb39 to
74c8419
Compare
…elect deprecation Three tidyselect contexts in tbl_mmrm() were passing string variables (arm, visit) as bare names instead of using all_of(). This triggered deprecation warnings from tidyselect 1.1.0+. Also added AGENTS.md to .gitignore. Co-authored-by: Ona <no-reply@ona.com>
…elect deprecation Three tidyselect contexts in tbl_mmrm() were passing string variables (arm, visit) as bare names instead of using all_of(). This triggered deprecation warnings from tidyselect 1.1.0+. Also added AGENTS.md to .gitignore.
803c081 to
76f4c0b
Compare
Unit Tests Summary 1 files 233 suites 3m 10s ⏱️ Results for commit 76ef8f4. ♻️ This comment has been updated with latest results. |
Unit Test Performance Difference
Additional test case details
Results for commit 5e67faa ♻️ This comment has been updated with latest results. |
Code Coverage SummaryDiff against mainResults for commit: 76ef8f4 Minimum allowed coverage is ♻️ This comment has been updated with latest results |
Tightly coupled function now shares the same file and help page via @Rdname. Removed standalone R/add_grade_column.R and its man page. Co-authored-by: Ona <no-reply@ona.com>
Tightly coupled function now shares the same file and help page via @Rdname. Removed standalone R/add_grade_column.R and its man page.
Co-authored-by: Ona <no-reply@ona.com>
…engineering/crane into js/fix-cartesian-join-explosion
…engineering/crane into js/fix-cartesian-join-explosion
…engineering/crane into js/fix-cartesian-join-explosion
…engineering/crane into js/fix-cartesian-join-explosion
Bare `label` inside modify_table_body() lambda is not visible to R CMD check. Use .data$label for all references. Co-authored-by: Ona <no-reply@ona.com>
Bare `label` inside modify_table_body() lambda is not visible to R CMD check. Use .data$label for all references.
Signed-off-by: Davide Garolini <dgarolini@gmail.com>
| # Set a wide console width for snapshot tests to prevent line-wrapping | ||
| # differences across platforms. Call inside test_that() blocks before | ||
| # expect_snapshot() — withr::local_options() resets automatically on exit. | ||
| local_wide_snapshot <- function(width = 220, .local_envir = parent.frame()) { | ||
| withr::local_options(list(width = width), .local_envir = .local_envir) | ||
| } |
There was a problem hiding this comment.
@llrs-roche do you think we need this? is this the right place?
There was a problem hiding this comment.
I don't think we need a function for this.
If you want to set a fixed width, we can directly add the call to withr on each test.
If there were more options or other configurations for the snapshots then it might make sense to have a specific function.
|
ready for review @shajoezhu @llrs-roche (Jan is out till monday I believe) |
a6da1a5 to
8c7f6de
Compare
Signed-off-by: Davide Garolini <dgarolini@gmail.com>
Signed-off-by: Davide Garolini <dgarolini@gmail.com>
llrs-roche
left a comment
There was a problem hiding this comment.
I have yet to check the PR locally or verified that the issue is indeed fixed but I have reviewed the code.
See comments below for some concerns
| # apply visual formatting to the table body | ||
| x <- x |> | ||
| gtsummary::modify_table_body( | ||
| \(table_body) { |
There was a problem hiding this comment.
We should consider if these anonymous functions can be extracted from the functions, simplified, and reused across the package. Not to address it now but please open an issue
| # Set a wide console width for snapshot tests to prevent line-wrapping | ||
| # differences across platforms. Call inside test_that() blocks before | ||
| # expect_snapshot() — withr::local_options() resets automatically on exit. | ||
| local_wide_snapshot <- function(width = 220, .local_envir = parent.frame()) { | ||
| withr::local_options(list(width = width), .local_envir = .local_envir) | ||
| } |
There was a problem hiding this comment.
I don't think we need a function for this.
If you want to set a fixed width, we can directly add the call to withr on each test.
If there were more options or other configurations for the snapshots then it might make sense to have a specific function.
| ADSL_pipe <- cards::ADSL | ||
| ADAE_pipe <- cards::ADAE |> | ||
| dplyr::filter( | ||
| AESOC %in% unique(cards::ADAE$AESOC)[1:3], | ||
| AETERM %in% unique(cards::ADAE$AETERM)[1:3] | ||
| ) |
There was a problem hiding this comment.
At the beginning of the file there is already an ADSL and ADAE objects created. Aren't they enough? Why do we need these modifications here?
| test_that("tbl_with_pools() + tbl_hierarchical_rate_by_grade() + add_grade_column() pipeline works", { | ||
| # Regression test for the Cartesian join explosion. | ||
| # Previously, tbl_hierarchical_rate_by_grade() blanked the `label` column for | ||
| # grade rows before returning, causing tbl_merge() inside tbl_with_pools() to | ||
| # lose row uniqueness and produce a Cartesian cross-join. |
There was a problem hiding this comment.
The comment should be summarized and be the title of the test
| #' @export | ||
| add_grade_column <- function(x) { |
There was a problem hiding this comment.
Why is added with tbl_hierarchical_rate_by_grade() if it must be called also after any merging?
|
Do we need a special version of cards or anything? devtools::test(filter = 'add_grade_column')
[ FAIL 7 | WARN 0 | SKIP 0 | PASS 2 ] |
Fixes #226
Description
tbl_hierarchical_rate_by_grade()was blanking thelabelcolumn for grade rows (label = "") before returning. Whentbl_with_pools()calledtbl_merge(), the join on(variable, row_type, var_label, label)lost row uniqueness — every blank-label grade row matched every other, producing a Cartesian cross-join with massive row duplication and column suffixing.Fix: Decouple data generation from visual presentation:
tbl_hierarchical_rate_by_grade()now returns a structurally pristine table wherelabelretains unique grade text. Injectscustom_infometadata for downstream use.add_grade_column()(new exported function) applies all visual formatting (label_gradecolumn, label blanking, headers, indentation) as a post-processing step.New workflow
How to test
Run
devtools::document()thendevtools::test(). Accept new snapshots withtestthat::snapshot_accept().Files changed
R/tbl_hierarchical_rate_by_grade.Rcustom_infometadata injectionR/add_grade_column.Rtests/testthat/test-add_grade_column.Rtests/testthat/test-tbl_hierarchical_rate_by_grade.Radd_grade_column()tests/testthat/test-tbl_with_pools.R